The PhyLoTA Browser: processing GenBank for molecular phylogenetics research.

Authors

  • Michael J Sanderson
  • Darren Boss
  • Duhong Chen
  • Karen A Cranston
  • Andre Wehe
Abstract

As an archive of sequence data for over 165,000 species, GenBank is an indispensable resource for phylogenetic inference. Here we describe an informatics processing pipeline and online database, the PhyLoTA Browser (http://loco.biosci.arizona.edu/pb), which offers a view of GenBank tailored for molecular phylogenetics. The first release of the Browser is computed from 2.6 million sequences representing the taxonomically enriched subset of GenBank sequences for eukaryotes (excluding most genome survey sequences, ESTs, and other high-throughput data). In addition to summarizing sequence diversity and species diversity across nodes in the NCBI taxonomy, it reports 87,000 potentially phylogenetically informative clusters of homologous sequences, which can be viewed or downloaded, along with provisional alignments and coarse phylogenetic trees. At each node in the NCBI hierarchy, the user can display a "data availability matrix" of all available sequences for entries in a subtaxa-by-clusters matrix. This matrix provides a guidepost for subsequent assembly of multigene data sets or supertrees. The database allows for comparison of results from previous GenBank releases, highlighting recent additions of either sequences or taxa to GenBank and letting investigators track progress on data availability worldwide. Although the reported alignments and trees are extremely approximate, the database reports several statistics correlated with alignment quality to help users choose from alternative data sources.
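The "data availability matrix" described above can be illustrated with a minimal sketch. This is not the Browser's implementation: the function name `availability_matrix`, the `(subtaxon, cluster_id)` record format, and the toy taxon and cluster labels are all hypothetical, chosen only to show how a subtaxa-by-clusters table of sequence counts might be assembled.

```python
from collections import defaultdict


def availability_matrix(records):
    """Build a subtaxa-by-clusters count matrix.

    `records` is an iterable of (subtaxon, cluster_id) pairs, one pair per
    sequence. This input format is an assumption made for illustration.
    """
    counts = defaultdict(lambda: defaultdict(int))
    clusters = set()
    for subtaxon, cluster_id in records:
        counts[subtaxon][cluster_id] += 1
        clusters.add(cluster_id)

    rows = sorted(counts)        # subtaxa, one per matrix row
    columns = sorted(clusters)   # sequence clusters, one per matrix column
    # Each cell holds the number of sequences for that subtaxon in that cluster.
    matrix = [[counts[t].get(c, 0) for c in columns] for t in rows]
    return rows, columns, matrix


# Hypothetical toy data: six sequences spread over three subtaxa and clusters.
records = [
    ("Taxon A", "cluster_1"),
    ("Taxon A", "cluster_1"),
    ("Taxon A", "cluster_2"),
    ("Taxon B", "cluster_2"),
    ("Taxon C", "cluster_1"),
    ("Taxon C", "cluster_3"),
]
rows, columns, matrix = availability_matrix(records)

for name, row in zip(rows, matrix):
    print(name, row)
```

In a table like this, clusters whose columns are nonzero for many subtaxa are the natural candidates for a multigene data set, while sparse columns show where sampling is thin.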

Similar resources

BioRuby: bioinformatics software for the Ruby programming language

SUMMARY The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs...

Op-molb130065 1720..1728

Since its first release in 2001 as mainly a software package for phylogenetic analysis, data analysis for molecular biology and evolution (DAMBE) has gained many new functions that may be classified into six categories: 1) sequence retrieval, editing, manipulation, and conversion among more than 20 standard sequence formats including MEGA, NEXUS, PHYLIP, GenBank, and the new NeXML format for in...

Molecular Phylogenetics and Evolution

Mitochondrial DNA sequence data from the control region and 12S rRNA in leopard frogs from the Sierra El Aguaje of southern Sonora, Mexico, together with GenBank sequences, were used to infer taxonomic identity and provide phylogenetic hypotheses for relationships with other members of the Rana pipiens complex. We show that frogs from the Sierra El Aguaje belong to the Rana berlandieri subgroup...

Revisiting the phylogeny of phylum Ctenophora: a molecular

The phylogenetic relationships of deep metazoans, specifically in the phylum Ctenophora, are not totally understood. Previous studies have been developed on this subject, mostly based on morphology and single gene analyses (rRNA sequences). Several loci (protein coding and ribosomal RNA) from taxa belonging to this phylum are currently available on public databases (e.g. GenBank). Here we revis...

Molecular phylogeny of three desert truffles from Iran based on ribosomal genome

The ITS region including the 5.8S gene of rDNA of three desert truffle species were amplified using ITS4 and ITS1 primers. The ITS sequences were compared to those of other related authentic sequences obtained from GenBank. Among 12 specimens studied, seven isolates corresponded to Terfezia claveryi reported by other authors. Iranian T. claveryi specimens had an average similarity of 99.4% (ran...

Journal:
  • Systematic biology

Volume 57, Issue 3

Pages  -

Publication date: 2008